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Abstract 

We study the conformational equilibria of two peptides using a 
novel statistical mechanics approach designed for calculating free en- 
ergy differences between highly dis-similar conformational states. Our 
results elucidate the contrasting roles of entropy in implicitly solvated 
leucine dipeptide and decaglycine. The method extends earlier work by 
Voter, and overcomes the notorious "overlap" problem in free energy 
computations by constructing a mathematically equivalent calculation 
with high conformational similarity. The approach requires only equi- 
librium simulations of the two states of interest, without the need for 
sampling transition states. We discuss extensions of the approach to 
binding affinity estimation and explicitly solvated systems, as well as 
possible optimizations. 



1 Introduction 

Although free energy differences (AG) are fundamental to the description of 
every molecular process, computer-based estimation of AG remains among 
the most difficult and time-consuming tasks in computational chemistry and 
biology. Despite the obstacles, molecular mechanics AG calculations are 
presently applied for protein engineering, 1 ' 2 drug design, 3, toxicology stud- 
ies, 5 solubility estimation, 6 ' 7 and determining binding affinities of ligands 
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to proteins. Although many ad hoc and approximate methods are in wide 
use, 9 including docking protocols, 10-14 there is a widely recognized need to 
more rigorously include molecular flexibility and entropic effects. 10-14 Ad- 
ditionally, the development of improved, polarizable molecular mechanics 
forcefields 6 ' 15-17 suggests rigorous, ensemble-based AG estimates will play 
an increasing role in elucidating subtle molecular effects — aided by the de- 
creasing cost of computer resources. 

Our concern here is to calculate the conformational free energy differ- 
ences AG and entropy differences A5 for peptides with a method based on 
statistical mechanics, which fully accounts for molecular flexibility. Because 
these techniques attempt a full accounting of entropic and enthalpic effects, 
they are computationally demanding — and it is the need for greater effi- 
ciency which we address here. Our ideas are framed and tested for molecular 
mechanics calculations, but they should also prove applicable with quantum 
mechanical computation. 18 

Present statistical mechanics methods can be classified into equilibrium 
and non-equilibrium approaches. Equilibrium AG calculations, such as ther- 
modynamic integration and multi-stage free energy perturbation, have been 
used successfully for many years. 19-22 Equilibrium methods can be very ac- 
curate, but have a large computational cost. On the other hand, Jarzynski's 
equality 23 has recently has opened up a host of non-equilibrium AG meth- 
ods. 24-28 Non-equilibrium methods can provide rapid estimates for AG, but 
typically suffer from bias without careful convergence testing. 29-32 Single- 
stage approaches, 7,33 may be considered non-equilibrium calculations which 
are based upon simulating only one or both end states of interest. 

Poor "overlap" — the lack of conformational similarity between the molec- 
ular states of interest — is a key cause of computational expense in statistical 
mechanics methods. This phenomenon is illustrated schematically in Figure 
n Unfortunately, in many cases of interest, poor overlap is the rule rather 
than the exception. Such dis-similarity is implicit in the critical problem of 
conformational equilibria, for instance. 

The most common approach to improve overlap in free energy calcu- 
lations is that used in thermodynamic integration, namely, simulating the 
system at multiple hybrid, intermediate stages (e.g., Refs. 6,19-22,34,35). 
An alternative strategy was taken by van Gunsteren and collaborators 36 
and McCammon and collaborators 33 who built non-physical intermediate 
reference states that increased overlap between the two end states. A third 
noteworthy solution to the overlap problem is the computation of absolute 
free energies for each of the end states, avoiding any need for configurational 
overlap. 3 
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Here, we address the overlap problem using a two-decade-old, elegant 
idea from solid-state systems. 41 In Ref. 41, Voter pointed out that shifting 
potential energy functions (or coordinates) by a constant vector in configu- 
ration space can dramatically improve the overlap of the states while main- 
taining the value of AG exactly; he applied this idea to tungsten dimers on 
a crystal surface. A similar approach was adopted for lattice systems by 
Bruce and collaborators. 42 In their "metric scaling" approach, Reinhardt 
and collaborators also built on Voter's idea for slow-growth AG estimates 
for crystal lattice changes. 43 Most recently, in a generalization of Voter's 
approach, Jarzynski introduced "targeted free energy perturbation," and 
applied the method to a Lennard-Jones fluid. 44 However, Ref. 44 also in- 
dicates that constructing the mapping is likely to be as hard as estimating 
the free energy itself, and 

Here, we introduce the first generalization of Voter's shifting strategy 
specifically tailored to molecular calculations. Because no intermediate 
stages are required, the approach has the potential to dramatically reduce 
computation times. Overlap is obtained by shifting internal coordinates of 
the molecule. The shifting strategy is further combined with Bennett's iter- 
ative approach 45 to efficiently utilize the data. These two ideas, (i) shifting 
internal coordinates and (ii) the use of Bennett's iterative method, provide 
the backbone of the single-stage shifting approach presented here. 

In outline, this report first builds the necessary mathematical frame- 
work for the single-stage shifting method in Section [21 Then in Section 
01 the method is tested on leucine dipeptide (ACE-(leu)2-NME) in GBSA 
implicit solvent, correctly calculating the free energy difference AG and en- 
tropy difference AS between the alpha and beta conformations. The leucine 
dipeptide system is also used to demonstrate various shifting approaches, 
and show the importance of using Bennett's iterative method. Finally, the 
single-stage shifting method is used to predict AG and AS between the 
alpha and extended conformations of decaglycine (ACE-(gly)io-NME) in 
GBSA solvent. 

While our results already demonstrate notable efficiency (the decaglycine 
calculations, for instance, would be extremely costly by conventional stag- 
ing methods), we have not yet pursued a number of fairly clear avenues 
for optimization; these are discussed in Section 0] Finally, the extension 
of the single-stage shifting method to explicitly solvated systems, and for 
"alchemical" mutations, is explored in Section El 



3 



2 The single-stage shifting method 



In this section we introduce a single-stage method which improves the over- 
lap between the end states by construction. In essence, a mathematically 
equivalent calculation, with high overlap, is performed instead of addressing 
the original problem. 



2.1 Constructing an equivalent shifted AG calculation 

Consider two end states defined by potential energy functions Uq(x) and 
Ui(x), where if is a set of configurational coordinates. The two states could 
be two different conformations, or the bound and unbound states of a pro- 
tein. For problems in conformational equilibria, as studied below, Uq and U\ 
can be the same potential function restricted to different regions of config- 
urational space (i.e., two conformational states). The free energy difference 
AG between these two states is given by 

0AG _ ZlU^x)} _ f dx e-^W 

Z[U {x)} fdxe-WoW { ' 

where (3 = 1/ksT, and Z indicates a partition function, and integration is 
to be performed over the indicated potential function. Note that for implicit 
solvation, as is used in this study, the Gibbs free energy is equivalent to the 
Helmholtz free energy. Using the formalism of free energy perturbation, 46 
we can re-write eq (0) as 



e — /3AG = : ^ — > = ( e-PL^iW-uoWj ) , (2) 



[dx ( e -P[Ui(x)-U (x)}\ -/3Uo(x) I 



J dx e-W*) 



o 



where the (...)o indicates an equilibrium average over Uq configurations. 

Voter's shifting strategy 41 improves the overlap between states (potential 
functions) without altering AG. Imagine shifting the origin of the configu- 
rational coordinates x in U\ by a constant vector G. This corresponds to a 
simple change of variables, x — > x + C, and Z\U\(x + C)\ = Z\U\{xj\ remains 
unchanged because integration is performed over all space. Eq © can now 
be written as 



/ dx (e ' M < 



e -/3AG = J \ _± _ / g -/3[C/i(ffi+C)-C/ (2 




In principle, eq (J3J implies that one can arbitrarily shift the coordinates for 
the Uq configurations needed for the average. (Shifting Uq rather than U\ 
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produces an analogous result.) In practice, the shifting should be done to 
maximize the overlap between configurations in Uq and U\. Unfortunately, 
no Cartesian shift vector can bring two distinct molecular configurations into 
overlap. Therefore, we apply the Voter idea in internal coordinates — and we 
term the approach single-stage shifting. 

To demonstrate the reasoning behind a shift in internal coordinates, con- 
sider the schematic torsional potentials shown in Figure ^i,b. Simulations 
for U\ will mainly sample the large "trans" well at (ft = 0, while simulations 
for Uq will mainly sample the large "gauche" well at (ft ~ —150. Shifting 
Ui by a constant corresponding to the difference between the minima of the 
two potentials (i.e., roughly 150 degrees for this example) produces Figure 
{Ifc, where the overlap between (ft configurations is excellent. Such a shift 
does not alter the U\ partition function and thus AG is unaffected. Note 
that, below, we shift internal coordinates rather than potentials, but these 
are equivalent procedures. 

In practice, the dimensionality of the system will be high and the shifting 
constant will not be as obvious as that in Figure ^ Reasonable methods for 
determining the shifting constant will be discussed in Section [2 .Ml 



2.2 Utilizing bi-directional data — Bennett's Methods 

In single-stage free energy perturbation (eq ©), only configurations from 
Uq are evaluated using U\. However, if simulations are performed in both 
states of interest, one could just as readily evaluate configurations from U\ 
using Uq. Bennett showed that the "bi-directional" (Uq — > U\ and U\ — > Uq) 
evaluations could be combined to minimize the uncertainty in AG. 45 

Bennett introduced both "iterative" and "acceptance-ratio" methods to 
utilize bi-directional data. 45 Generalizing the acceptance-ratio formulation 
to the shifting approach yields ' 



/ minf 1.0, e 



-p{u x (jl+d)-Uo&))^ \ 



l 



Similarly, generalizing to the iterative method to shifted coordinates leads 
to the following relation 45 



\ + e -p(u (g)-U 1 (3+d)+AG)' S j l J = ^ 1 + e -/3(f 1 (SKoM)-AG)j ^ 
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Since AG is on both sides of eq it must be solved in an iterative fashion. 
Eq. in its un-shifted form, has been shown to be the optimal use of bi- 
directional data. 35 ' 45 ' 47 ' 48 

Below we use the single-stage shifting method to calculate AG using eqs 
Q and ® with internal coordinates shifted by a constant G, which 
is chosen to maximize the overlap between Uq and U\ states. For the sys- 
tems studied below, leucine dipeptide and decaglycine, we find the iterative 
method of eq (JSJ) to be the most efficient use of the raw data. 

2.3 Practical implementation in molecular systems 

Determining a shifting constant G in eq © involves making a decision about 
the subset of coordinates to shift. There is generally a minimum number of 
coordinates needed. For example, for leucine dipeptide, we define the alpha 
and beta conformations using two backbone torsion angles; thus these two 
torsions must be shifted. In practice, it is also possible to shift too many 
coordinates, leading to steric clashes. This will be shown below for peptides. 

Once the subset of shifted internal coordinates has been determined, the 
shift constant G must be calculated for all coordinates in the subset. In 
Figure we chose to shift according to the minimum of Uq and U\. This 
leads to our first approach: after equilibrium simulation in each potential, 
find the lowest energy frame (snapshot) for both Uq and U\\ then choose the 
constant vector G which aligns the two lowest-energy frames, 

G = y min -yr n , (6) 

where y represents the subset of coordinates (e.g., only torsions) of the 
configuration x. Strictly speaking, G is a vector of the same dimensionality 
as x, with zero for every component not present in y. Another reasonable 
choice for a shift constant is realized by generating a histogram for each 
coordinate in y for both Uq and U\. The shift constant is then chosen to 
align the peaks of these histograms, 

c = y ™ de -yi mode , ( 7 ) 

Below, we shift using both of these choices for various internal coordinate 
subsets, and compare the results. 

Procedurally, an estimate of AG is generated using the following steps: 

1. Generate an equilibrium trajectory in both states of interest, i.e., using 
Uq and U\. Determine, as discussed above, the subset of coordinates 
to shift y, and the shifting vector G using eq (jHJ) or (JJJ). 
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2. For each frame in the Uq trajectory, shift the internal coordinates by 
C (e.g., (f>2 — > <t>2 + 90.0). Then, for each Uq frame, x, evaluate and 
record U\(x + C) — Uq(x). 

3. Repeat step 2 for each frame in the U\ ensemble with one important 
difference — the frames must be shifted in the opposite direction (e.g., 
4> 2 -> ^2 - 90.0), yielding U (x - C) - TJ x {x). 

4. Solve eq ©, @ and/or ® to determine a AG estimate. 

Below, we utilize this process to generate multiple AG estimates, after which 
the mean AG and standard deviation are calculated. 

When shifting internal coordinates care must be taken to shift the coordi- 
nates so that the partition function remains unchanged. Using the standard 
internal coordinates for bond length r, bond angle 9 and dihedrals u, a 
differential volume element in configuration space is given by 

dx\ = r\dr\ ■sm9\d9\ dw\ = d(rf/3) d(cos9i) duj\, (8) 

where x\ represents one possible set of internal coordinates. Equation (JHJ) 
holds for every possible set of internal coordinates, and thus implies that the 
shifting of internal coordinates must be done according to simple rules. Bond 
lengths must be shifted according to the cube of the length: r 3 — > r 3 — r 3 , 
where r c is the bond length shifting constant (i.e., one component of the 
vector G) found either by the peaks from histograms of r 3 and using eq , 
or by comparing minimum energy frames and using eq ©• Similarly, bond 
angles must be shifted using cos — > cos 9 — cos 9 C , and dihedrals are shifted 
via lo — > u — u> c . 

3 Results for peptides 

To test the effectiveness of the single-stage shifting method, we performed 
all-atom simulations of leucine dipeptide (ACE-(leu)2-NME) and decaglycine 
(ACE-(gly)io-NME). Both peptides were simulated using the TINKER Ver- 
sion 4.2 molecular dynamics package. 49 The peptides were solvated im- 
plicitly using the generalized Born surface area (GBSA) approach 50 and 
Langevin dynamics were utilized with the friction coefficient set to that of 
water (91.0 psec -1 ). A time step of 1.0 fsec was chosen for all simulations. 
Leucine dipeptide was maintained at 500.0 K (to enable independent verifi- 
cation of our result) and utilized the CHARMM27 forcefield parameter set. 51 
Decaglycine was maintained at 300.0 K and used the AMBER96 forcefield 
parameter set 52 for comparison with Ref. 37. 
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3.1 Leucine dipeptide 

Leucine dipeptide (ACE-(leu)2-NME) was chosen as a test system because 
(i) it possesses some of the complexity of a large molecule — four backbone 
torsions and eight side-chain dihedrals; and (ii) it is small enough to allow 
for very long simulation times. Long simulation times are necessary for 
an unbiased, independent determination of the conformational population 
(hence AG), providing a strict test of the single-stage shifting method. 

For leucine dipeptide we calculated the free energy difference AG and 
entropy difference AS for the alpha — > beta conformational change. We 
defined the alpha and beta conformations using two internal backbone tor- 
sions, namely, for alpha: —145 < <p2 < — 25 and — 125 < ipi < —5; and 
for beta: -160 < 2 < -40 and 70 < tpi < -170. A temperature of 500 
K was chosen to enable repeated crossing of the free energy barrier between 
the alpha and beta conformations. At 500 K, leucine dipeptide switches be- 
tween alpha and beta conformations at a rate of around 2.5 transitions per 
nsec with GBSA solvation. Note that this high temperature is required only 
to obtain an unbiased estimate of AG — our single-stage shifting approach 
works equally well at lower temperatures. 

To obtain an independent and unbiased value of AG, four 1.0 usee simu- 
lations were performed yielding around 2500 transition events per trajectory. 
The four trajectories were then used to calculate AG via the definition 



where A a i p ha and Abota are, respectively, the number of dynamics steps the 
system was in the alpha and beta conformations. The unbiased value of 
AG for alpha — > beta was found to be AG = 0.95 ± 0.05 kcal/mole, where 
the value of AG given is the average of the four 1.0 /isec estimates with 
uncertainty given by the standard deviation. 

The unbiased entropy difference was calculated using eq via TAS = 
AU avg — AG where AC/ av g is the average of L^eta — C^aipha- We found that 
TAS" was zero within uncertainty. This is consistent with our observation 
that the fluctuating degrees of freedom for leucine dipeptide are mainly the 
side-chain torsions, and thus do not change dramatically between the alpha 
and beta conformations. Given the overall AG value, the alpha conformation 
is favored to to intra-molecular attractions. 

With an unbiased value of AG from eq ©, it is possible to test various 
implementations of the single-stage shifting method. To this end, we simu- 
lated leucine dipeptide in the alpha and beta conformations, generating four 




(9) 
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1.0 nsec trajectories for each conformation. Each of the trajectories were 
obtained by constraining the backbone torsions to stay within the defined 
ranges for alpha and beta, given above. The constraining force was zero if 
the torsion angles were within the allowed range, and harmonic otherwise. 
Frames were saved every 0.1 psec yielding 10,000 frames per trajectory. 

The four trajectories provide sixteen independent single-stage shifting 
estimates of AG and AS using all possible pairings. The results for AG 
are summarized in Table ^ where we used both the lowest energy frame 
and histogram peak shifting approaches (Section 12. M|) to estimate AG for 
leucine dipeptide. The value of AS was found to be zero within uncertainty, 
consistent with the value found using eq ©• For each shifting approach, we 
tested four sets of shifting coordinates: backbone torsions only, all torsions, 
all torsions and bond angles, or all internal coordinates (torsions, angles and 
bond lengths). In addition, we also show results using both the iterative 
method of eq © and the acceptance-ratio method of eq (HJ). The values 
for AG are averages with standard deviations shown in parentheses. Our 
single-stage shifting results in Tabled agree well with the unbiased estimate 
given above. The table also show that shifting torsions results in a small 
uncertainty, while including bond angles and lengths makes the AG estimate 
less certain. 

In all of our simulations, we have found that shifting torsions (either 
backbone only, or all torsions), and using the iterative method of eq ([5]). has 
consistently provided accurate results. Also, as demonstrated in Table ^ 
lower uncertainty is obtained by shifting according to histogram peaks and 
using eq (J7J) rather than shifting by the lowest energy frames. 

We also stress the importance of using bi-directional data to determine 
AG for conformational equilibria. To this end, in Figure HI we employed the 
single-stage shifting method using three data analysis protocols: single-stage 
free energy perturbation in both directions (eq Q), 46 Bennett's acceptance- 
ratio method (eq (JIJ)) and Bennett's iterative method (eq (JSJ)). 45 The su- 
perior convergence properties of the iterative method can be seen in Figure 
12 The solid horizontal black line represents the independent, 4.0 //sec AG 
value obtained by using eq © • The data are AG estimates using the single- 
stage shifting method, where backbone torsions were shifted according to 
the histogram peaks. The data for the figure was generated using a single 
trajectory in each of the alpha and beta conformations. The red curve was 
generated from Bennett's acceptance-ratio method, the green dashed curve 
is free energy perturbation in the forward direction (i.e., AG a i p h a ^beta)) an d 
the green solid curve is free energy perturbation in the reverse direction (i.e., 
— AGbeta^aipha)- Finally the solid blue curve is Bennett's iterative method. 
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It is clear from the figure that using bi-directional data is very important to 
the success of our single-stage shifting method. Further, Bennett's iterative 
method is shown to converge more quickly than the other methods. 

Figure 01 demonstrates the efficiency of the single-stage shifting method 
compared to long simulation and use of eq ®. The horizontal black line 
is the unbiased AG from long (4.0 //sec) simulation. The curve shows the 
average (blue squares) and standard deviation (errorbars) of the single-stage 
shifting method where backbone torsions were shifted by the histogram 
peaks, and the iterative method of eq (JJJJ) was used to analyze the data. 
In our unconstrained simulations, leucine dipeptide switched between the 
alpha and beta conformations, on average, once every 400 psec. Using our 
single-stage shifting method, with only 30 psec of simulation (15 psec in 
alpha and 15 psec in alpha), a reasonably accurate and precise value of AG 
can be obtained. 

3.2 Decaglycine 

We also applied the single-stage shifting method to decaglycine (ACE-(gly)io- 
NME), predicting the conformational AG and entropy difference AS for al- 
pha — ► extended conformations. We again defined the alpha and extended 
conformations by the internal backbone torsions (i.e., excluding <f>\ and ipio), 
namely, for alpha: —115 < < 5 and — 115 < ip < 5; and for extended: 
120 < (f> < -120 and 120 < tp < -120. Previous AG and AS calculations 
of decaglycine in vacuum were performed by Karplus and Kushick, 39 and 
quite recently by Cheluvaraja and Meirovitch. 37 Apparently, decaglycine's 
conformational AG and AS have not previously been computed in implicit 
solvent. 

To calculate AG and AS", four trajectories in each conformation were 
generated using the simulation parameters defined previously. Thus, sixteen 
independent AG and AS" estimates can be calculated. Each trajectory was 
1.0 nsec in length with a frame saved every 0.1 psec yielding 10,000 frames 
per trajectory — although this may be quite sub-optimal; see Section 0J As 
with leucine dipeptide, the sixteen AG and AS estimates were generated 
using various shifting approaches. 

Table |21 shows the results of our calculation of AG and AS for the alpha 
— > extended conformational change using various shifting approaches. The 
entropy change was estimated via TAS = At7 avg — AG where AC/ avg is 
the average of U ex t — £4ipha- Acceptance ratio estimates had a much larger 
uncertainty then the iterative method estimates and thus are not included in 
the table. The results clearly demonstrate that shifting by histogram peaks 
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provides a higher level of precision than shifting by the lowest energy frame. 
Also, as with leucine dipeptide, lower uncertainty is obtained when shifting 
torsions only (i.e., not bond angles and lengths). 

The results in Table |2] suggest that the "compensating" role of the en- 
tropy is vital for an accurate AG calculation in GBSA solvent. Our studies 
of decaglycine in vacuum (data not shown), as well those of other groups ' 39 
show that the alpha conformation is more stable than extended, due mainly 
to energetics. However, our results suggest that, with the addition of (im- 
plicit) solvent, the energy difference between the two conformations becomes 
small enough that the entropy term dominates AG — to the degree that the 
extended conformation is more stable than alpha. 

Figure |1] shows AG and TAS as a function simulation time. The data 
points are the average of the sixteen independent estimates with standard de- 
viation given by the error bars for both AG (blue squares) and TAS (green 
circles). These estimates were obtained using the iterative method, and 
shifting the backbone torsions by the histogram peak. The figure demon- 
strates the apparent convergence of the AG and TAS estimates. Note that 
it would be impractical to obtain an independent estimate for decaglycine 
(as we did with leucine dipeptide) because of the required simulation times. 

In the current implementation, a total of 8 nsec of simulation was re- 
quired to obtain a reasonably accurate and precise estimate of AG. To 
our knowledge, no multi-stage calculation has ever been attempted on this 
system, undoubtedly due to the prohibitive computational expense. Never- 
theless, we believe additional optimization of our current shifting protocol 
will be possible, as we now discuss. 

Finally, we note that, as in any AG computation, our results reflect the 
definitions chosen for the alpha and extended states. 

4 Further optimization of the single-stage shifting 
approach 

While the present results indicate that peptide conformational equilibria 
can be determined by sub- nsec simulations, several promising avenues for 
optimization have not been explored. We briefly sketch several possible ap- 
proaches for improving efficiency, including combining the shifting approach 
with traditional staged calculations. 

It is useful to consider the upper limit for the computational cost of the 
single-stage shifting procedure. The maximum is essentially twice that of 
equilibrium simulation, provided the method is hard-wired into the molec- 
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ular dynamics program, due to the extra energy call that must be made for 
each shifted frame. In the current study, the shifting procedure was scripted 
external to the simulation program (TINKER), and thus the cost for tra- 
jectory analysis was high — limiting the number of frames per trajectory to 
10,000. We found that, for this fixed number of frames per trajectory, the 
simulation time between frames had very little affect on the AG estimate. 
Thus, we feel that substantial increases in efficiency could be realized by 
utilizing every frame in the AG calculation — i.e., every time step. (It is 
worthwhile to recall that interactions change enough over a single time step 
to require re-calculation of forces.) Ultimately, then, reliable and accurate 
AG calculations should take no more time than required to sample the equi- 
librium ensemble in a given state (the minimum time for any AG method). 

Additionally, the shifting procedures explored in this report ignore corre- 
lation between coordinates, such as those known from Ramachandran plots, 
where backbone torsions <j) and ip do not vary independently. Ramachan- 
dran plots, moreover, average over many residues, which individually are 
likely more correlated. To motivate more general shifts, consider one state 
where a certain pair of 4>, tjj angles inhabit a predominantly vertical region of 
Ramachandran space, while the other state populates a region with a very 
different orientation. In such a case, simple shifts alone (e.g., those in eqs 
(jHJ) and (J7J)) will not maximize overlap to the extent that a combined shift 
and (partition- function-preserving) rotation would. Further, if one oblong 
well is very narrow and the other well is very broad, then coordinate scaling 
(contraction/expansion) should also be performed (see also Refs. 43,44). In 
general the coordinates can be linearly scaled, rotated and/or translated 
using a constant matrix A (i.e., x — > Ax + G), and eqs ®, © and © 
must be generalized to account for the matrix A. More complex, nonlinear 
transformations are also possible, but may not be practical. 

In larger systems than those considered here (e.g., whole proteins), the 
gain in overlap due to the internal coordinate shift may prove insufficient 
to permit reliable single-stage computation of AG values. In such cases, 
it may be advantageous to combine the single-stage shifting approach with 
multi-stage methodology. To do so, a path connecting the two states of 
interest can be defined (e.g., Refs. 53-55), and independent trajectories can 
be generated at intermediate stages along the path. Then, between each 
successive intermediate stage, the incremental free energy difference (6G) 
is estimated using the shifting protocol outlined in Section 12.31 The 5G 
estimates can then be summed to obtain the full AG for the complete path. 
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5 Extension to "alchemical" calculations and ex- 
plicitly solvated systems 

It is possible to extend the formalism of the single-stage shifting method 
beyond conformational AG calculation for implicitly solvated molecules. In 
this section we outline the potential for the single-stage shifting method to 
be used for "alchemical" mutations — which are the basis for relative binding 
affinities and solubilities 56 — and on explicitly solvated systems. 

For alchemical mutation, two distinct potential energy functions Uq and 
U\ — one for each molecule — are used in eqs (J3J), Q and ©. While the math- 
ematical formalism is unchanged from conformational AG calculations, the 
difficulty in alchemical mutations lies in determining the shifting vector G, 
since the number of degrees of freedom for Uq and U\ are different, in gen- 
eral. (For conformational AG calculations, such as those above, the number 
of degrees of freedom for Uo and U\ are always the same.) This difficulty 
can be overcome by introducing "dummy" coordinates as in Ref. 54. Al- 
though dummy coordinates will change the absolute free energy values, use 
of a thermodynamic cycle guarantees that the free energy difference AG will 
remain unchanged. Thus, accurate relative binding affinities and solubilities 
can be obtained. 54 

If explicit solvation is used for conformational or alchemical AG calcu- 
lation, then the single-stage shifting method must be generalized to include 
non-standard mtermolecular "external" coordinates. This can, in principle, 
be accomplished by introducing auxiliary vectors to describe the location 
and orientation (e.g., Euler angles) of each solvent or solute molecule. The 
necessary shifting vector G will now include the full set of intra- and inter- 
molecular coordinates. 

6 Conclusion 

In a study of peptide equilibria, we have demonstrated a simple method for 
substantially overcoming the overlap problem in calculating free energy dif- 
ferences (AG) and entropy difference (AS) between conformational states. 
The new single-stage shifting method utilizes a shift in internal coordinates 
to improve the overlap between configurations, motivated by Voter's study 
in Ref. 41. The approach requires only simulation in the two states of in- 
terest without the need for "staged" intermediate calculations. Bennett's 
iterative method 45 is used to efficiently calculate a AG value from the raw 
data. 
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We tested the single-stage shifting approach on two peptides, obtaining 
excellent results with sub-nsec simulation times. First, for leucine dipep- 
tide in implicit solvent, we accurately calculated the conformational AG 
for alpha — > beta conformations — judging by nearly perfect agreement with 
a 4.0 /usee simulation. The entropy difference AS* was found to be nearly 
zero, also consistent with long simulation. The single-stage shifting method 
was then used to predict the conformational AG and AS for alpha — > ex- 
tended conformations of decaglycine in implicit solvent, apparently for the 
first time. We find that, with implicit solvent, the extended conformation 
of decaglycine is favored over alpha, due mainly to the entropy gain in the 
extended state. By contrast, in vacuum, the alpha conformation is preferred 
due mainly to the strongly favorable intra-molecular interactions. It must 
be borne in mind that our quantitative results necessarily depend on our 
state definitions. 

While the present report describes a single type of application of the 
single-stage shifting approach (to conformational equilibria), we believe the 
idea will find quite broad applications — in part due to the substantial po- 
tential for further optimization. We have therefore discussed optimization of 
the method, and application to "alchemical" mutations (for relative binding 
affinities), as well as the use of explicit solvent. The single-stage shifting ap- 
proach may also be combined with multi-stage simulation, allowing further 
optimization. We are currently exploring these ideas. 
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Shifting 
approach 


Shifted coordinates 


Iterative 
AG (kcal/mol) 


Accept, ratio 
AG (kcal/mol) 


Peak of 
histogram 
(eq ©) 


Backbone torsions only 
All torsions 

All torsions and bond angles 
All internal coordinates 


1.08 (0.15) 
0.56 (0.87) 
0.76 (2.64) 
0.78 (2.63) 


1.02 (0.13) 
0.55 (0.86) 
1.01 (3.25) 
1.09 (3.26) 


Lowest 
energy- 
frames 
(eq ©) 


Backbone torsions only 
All torsions 

All torsions and bond angles 
All internal coordinates 


1.29 (0.35) 
0.99 (0.66) 
3.36 (4.75) 
4.50 (7.16) 


1.30 (0.39) 
0.98 (0.66) 
6.48 (9.45) 
8.80 (14.32) 



Table 1: Conformational free energy differences AG for alpha — > beta 
conformations for leucine dipeptide in GBSA solvent using several variations 
of the single-stage shifting approach. Each shifting AG result is an average 
of sixteen independent 2.0 nsec estimates of AG with standard deviation 
shown in parentheses. The iterative results use eq © and the acceptance- 
ratio results use eq @. For comparison, the value of AG obtained from 
long simulation (4.0 /xsec) is 0.95 (0.05) kcal/mole. The entropy difference 
AS, found by single-stage shifting, was zero within uncertainty, consistent 
with the value obtained by long simulation. This table also demonstrates 
that shifting too many coordinates (i.e., bond angles and lengths) leads to 
steric clashes, and thus a larger uncertainty for AG. 
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Shifting 
approach 


Shifted coordinates 


Iterative 
AG (kcal/mol) 


Iterative 
TAS (kcal/mol) 


Peak of 
histogram 
(eq ©) 


Backbone torsions only 
All torsions 

All torsions and bond angles 
All internal coordinates 


-12.39 (0.47) 
-12.59 (0.82) 
-12.77 (2.94) 
-12.75 (3.11) 


9.75 (0.66) 
9.95 (0.93) 
10.12 (2.97) 
10.10 (3.13) 


Lowest 
energy 
frames 
(eq ©) 


Backbone torsions only 
All torsions 

All torsions and bond angles 
All internal coordinates 


-11.97 (1.59) 
-12.12 (2.57) 
-18.53 (11.14) 
-24.79 (14.03) 


9.34 (1.71) 
9.48 (2.76) 
15.89 (10.93) 
22.15 (13.86) 



Table 2: Conformational free energy differences AG and entropy differ- 
ences TAS for alpha — ► extended conformations of decaglycine in GBSA 
solvent using several shifting approaches. Each shifting result is an average 
of sixteen independent 2.0 nsec estimates with standard deviation shown in 
parentheses, using the iterative method of eq (JSJ). Acceptance ratio results 
(data not shown) had a much larger uncertainty than the iterative method. 
This data clearly shows that lower uncertainty is obtained by shifting via 
histogram peaks rather than by the lowest energy frame. Also, shifting non- 
torsional coordinates (i.e., bond angles and lengths) dramatically increases 
uncertainty. 
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Figure 1: Illustration of the shifting idea using two idealized torsional po- 
tentials (a) Uq(4>) and (b) U\((j)). There is minimal configurational overlap 
between these two potentials, since simulations will mainly sample the deep, 
dis-similar minima of the potentials, (c) With the use of an appropriate shift- 
ing constant, it is possible to construct overlap between the states without 
altering the numerical value of the conformational free energy difference. 
Note that for peptides, we shift internal coordinates rather than potentials 
(see Section 123)) . however, these two procedures are equivalent. 
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Figure 2: Comparison between several shifting methods for the conforma- 
tional change in free energy for alpha — » beta of leucine dipeptide in GBSA 
solvent. The alpha (top left, "cis-like") and beta (top right, "trans- like" ) 
conformations are also shown, with the backbone carbons colored black. 
Two trajectories (one in each conformation) were analyzed using several 
protocols of the single-stage shift technique. The solid horizontal black line 
represents the 4.0 usee value of AG obtained by using eq ©• The curves 
are obtained using the single-stage shifting method where backbone torsions 
were shifted according to the peak of the histogram (i.e., lowest uncertainty 
AG estimate in Table The red curve was generated from Bennett's 
acceptance-ratio method, the green dashed curve is free energy perturba- 
tion in the forward direction (i.e., AG a i p ha-+bota), and the green solid curve 
is free energy perturbation in the reverse direction (i.e., — AGbeta^aipha)- 
The solid blue curve is Bennett's iterative method. The figure clearly sug- 
gests the superior convergence properties of the iterative method for our 
single-stage shifting data. 
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Figure 3: Conformational free energy difference AG for alpha — > beta 
conformations of leucine dipeptide in GBSA solvent, shown as a function 
of the simulation time. The horizontal black line shows the known AG 
obtained using 4.0 //sec of simulation and eq @. The data were obtained 
using the single-stage shifting method where backbone torsions were shifted 
according to the peak of the histogram, and the iterative method of eq 
was used to analyze the data. The average AG estimates (blue squares) and 
standard deviations (errorbars) were calculated using sixteen independent 
estimates of AG. Using the single-stage shifting method, with only 30 psec of 
simulation (15 psec in alpha and 15 psec in beta), the average estimate of AG 
is within 1.0 kcal/mol of the known value with a standard deviation of less 
than 1.0 kcal/mol. This may be compared with unconstrained simulations, 
in which transitions between alpha and beta conformations occurred, on 
average, every 400 psec. 
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Figure 4: Conformational free energy difference AG and entropy difference 
TAS for the alpha — > extended conformational change in decaglycine with 
GBSA solvent. The alpha (top left) and beta (top right) conformations are 
also shown. The data were obtained from eight independent simulations — 
four constrained to be in alpha and four in extended. From the eight simula- 
tions, sixteen independent estimates of AG were obtained, and the averages 
(blue squares) and standard deviations (error bars) were calculated as a 
function of simulation time. Each AG estimate was generated using the end 
point shifting method by shifting backbone torsions via histogram peaks, 
and using the iterative method of eq © • 
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